Abstract. Consider a random bipartite multigraph $G$ with $n$ left nodes
and $m \ge n \ge 2$ right nodes. Each left node $x$ has $d_x \ge 1$ random right
neighbors. The average left degree $\Delta$ is fixed, $\Delta \ge 2$. We ask whether for
the probability that $G$ has a left-perfect matching it is advantageous not
to fix $d_x$ for each left node $x$ but rather choose it at random according
to some (cleverly chosen) distribution. We show the following, provided
that the degrees of the left nodes are independent: If $\Delta$ is an integer
then it is optimal to use a fixed degree of $\Delta$ for all left nodes. If $\Delta$ is
non-integral then an optimal degree-distribution has the property that
each left node $x$ has two possible degrees, $\lfloor\Delta\rfloor$ and $\lceil\Delta\rceil$, with probability
$p_x$ and $1 - p_x$, respectively, where $p_x$ is from the closed interval $[0, 1]$ and
the average over all $p_x$ equals $\lceil\Delta\rceil - \Delta$. Furthermore, if $n = c \cdot m$ and
$\Delta > 2$ is constant, then each distribution of the left degrees that meets
the conditions above determines the same threshold $c^*(\Delta)$ that has the
following property as $n$ goes to infinity: If $c < c^*(\Delta)$ then there exists a
left-perfect matching with high probability. If $c > c^*(\Delta)$ then there exists
no left-perfect matching with high probability. The threshold $c^*(\Delta)$ is the
same as the known threshold for offline $k$-ary cuckoo hashing for integral
or non-integral $k = \Delta$.
1 Introduction

We study bipartite multigraphs $G$ with left node set $S$ and right node set $T$,
where each left node $x$ from $S$ has $D_x$ right neighbors. The right neighbors are
chosen at random with replacement from $T$, where the number of choices $D_x$ is a
random variable that follows some probability mass function $\rho_x$. Let $|S| = n$ and
let $|T| = m$ as well as $1 \le D_x \le m$ for all $x$ from $S$. For each $x$ from $S$ let $\Delta_x$
be the mean of $D_x$, that is, $\Delta_x = \sum_{l=1}^{m} l \cdot \rho_x(l)$, and let $\Delta$ be the average mean,
i.e., $\Delta = \frac{1}{n} \cdot \sum_{x \in S} \Delta_x$. We assume that the random variables $D_x$, $x \in S$, are
independent and $\Delta$ is a given constant.

Our aim is to determine a sequence of probability mass functions $(\rho_x)_{x \in S}$ for
the random variables $(D_x)_{x \in S}$ that maximizes the probability that the random
graph $G = G(\Delta, (\rho_x)_{x \in S})$ has a matching that covers all left nodes, i.e., a
left-perfect matching^1. We call such a sequence optimal. Note that there must be
some optimal sequence for compactness reasons.
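The random graph model described above can be sketched in a few lines. This is a minimal illustration, not the authors' experimental code: the names `sample_graph` and `has_left_perfect_matching` are hypothetical, each probability mass function is assumed to be given as a dictionary from degree to probability, and the matching test uses the standard augmenting-path method, which is adequate only for small instances.

```python
import random


def sample_graph(pmfs, m, rng):
    """Sample one multigraph: pmfs[x] maps a degree to its probability for left node x.

    Each left node draws its degree from its own pmf and then picks that many
    right neighbors from range(m) uniformly at random WITH replacement.
    """
    graph = []
    for pmf in pmfs:
        degrees = list(pmf)
        weights = [pmf[d] for d in degrees]
        d = rng.choices(degrees, weights=weights)[0]
        graph.append([rng.randrange(m) for _ in range(d)])
    return graph


def has_left_perfect_matching(graph, m):
    """Return True if every left node can be matched to a distinct right node."""
    match_of_right = [-1] * m  # match_of_right[v] = left node currently using v

    def try_assign(x, seen):
        # Search for an augmenting path that starts at left node x.
        for v in graph[x]:
            if v in seen:
                continue
            seen.add(v)
            if match_of_right[v] == -1 or try_assign(match_of_right[v], seen):
                match_of_right[v] = x
                return True
        return False

    # If some left node cannot be added, no left-perfect matching exists.
    return all(try_assign(x, set()) for x in range(len(graph)))
```

Repeating `has_left_perfect_matching(sample_graph(pmfs, m, rng), m)` over many samples gives an empirical estimate of the success probability of a given sequence of probability mass functions.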

* Research supported by DFG grant DI 412/10-2. 

^1 In the following we will use "matching" and "left-perfect matching" synonymously.



1.1 Motivation and Related Work 

Studying irregular bipartite graphs has led to major improvements in the
performance of erasure correcting codes. For example in [6] Luby et al. showed 
how to increase the fraction of message bits that can be recovered for a fixed 
number of check bits by using carefully chosen degree sequences for both sides 
of the underlying bipartite graph. The recovery process for erased message bits 
translates directly into a greedy algorithm for finding a matching in the bipartite 
graph associated with the recovery process. This was the motivation for the 
authors of [1,2] to study irregularity in the context of offline $k$-ary cuckoo hashing.
Here one has a bipartite graph with left nodes corresponding to keys and right 
nodes corresponding to table cells, where each key randomly chooses table cells 
without replacement and the aim is essentially to find a left-perfect matching. In 
[1] it was proven that if the degree of each left node follows some distribution 
with identical mean and is independent of the other nodes then it is optimal in 
an asymptotic sense if the degree of each left node is concentrated around its 
mean. This is in contrast to the following observation in [7], made in analogy to [6]: an
uneven distribution of the degrees of the left nodes can increase the probability 
for the existence of a matching that has the advantage that it can be calculated 
in linear time, by successively assigning left nodes to right nodes of degree one 
and removing them from the graph. 

1.2 Results 

We will show that for given parameters $n$, $m$, and $\Delta$ there is an optimal sequence
of probability mass functions that concentrates the degree of the left nodes
around $\lfloor\Delta\rfloor$ and $\lceil\Delta\rceil$. Furthermore, if $\Delta$ is an integer we can explicitly determine
this optimal sequence. In the case that $\Delta$ is non-integral we will identify a tight
condition that an optimal sequence must meet.

Theorem 1. Let $n \le m$, as well as $n, \Delta \ge 2$, and let $(\rho_x)_{x \in S}$ be an optimal
sequence for parameters $(n, m, \Delta)$. Then the following holds for all $x \in S$.

(i) If $\Delta$ is an integer, then $\rho_x(\Delta) = 1$.

(ii) If $\Delta$ is non-integral, then $\rho_x(\lfloor\Delta\rfloor) \in [0, 1]$ and $\rho_x(\lceil\Delta\rceil) = 1 - \rho_x(\lfloor\Delta\rfloor)$.

The second statement is not entirely satisfying since it identifies no optimal solu- 
tion. However, we will give strong evidence that in the situation of Theorem 1 (ii) 
there is no single, simple description of a distribution that is optimal for all 
feasible node set sizes. 

Since the case $\Delta = 2$ is completely settled by Theorem 1 (i), we focus on
the cases where $\Delta > 2$, with the additional condition that the number of left
nodes is linear in the number of right nodes, that is $n = c \cdot m$ for constant $c > 0$.
We show that for sufficiently large $n$ all sequences that meet the condition of
Theorem 1 (ii) asymptotically lead to the same matching probability. Therefore,
we call these sequences near optimal.

Proposition 1. Let $n = c \cdot m$, for constant $c > 0$, and let $(\rho_x)_{x \in S}$ be a near
optimal sequence with average expected degree $\Delta > 2$. Then for sufficiently large
$n$ there is a threshold $c^*(\Delta)$ such that the random graph $G = G(\Delta, (\rho_x)_{x \in S})$ has
the following property.

(i) If $c < c^*(\Delta)$, then $G$ has a matching with probability $1 - o(1)$.

(ii) If $c > c^*(\Delta)$, then $G$ has no matching with probability $1 - o(1)$.

The threshold $c^*(\Delta)$ is exactly the same as the threshold given in the context of
$k$-ary cuckoo hashing for integral $k$ [5,4,2], and non-integral $k$ [2], where $k = \Delta$.
So in the case that $n = c \cdot m$ all near optimal sequences are hardly distinguishable
in terms of matching probability, at least asymptotically, but we will give
strong evidence that there are only two sequences that can be optimal, where
the decision which one is the optimal one depends on the ratio $c$.
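For integral $k$ the threshold mentioned above has a closed description in the cuckoo hashing literature. The sketch below relies on that description as recalled from the cited line of work, not on anything stated in this text: the threshold is $\xi / (k \cdot (1 - e^{-\xi})^{k-1})$, where $\xi$ is the unique positive solution of $k = \xi (e^{\xi} - 1) / (e^{\xi} - 1 - \xi)$. The function name `cuckoo_threshold` and the bisection bounds are assumptions, and the non-integral case treated in [2] is not covered.

```python
import math


def cuckoo_threshold(k, iterations=200):
    """Threshold for integral k > 2 (formula assumed from the cuckoo hashing literature)."""
    if k <= 2:
        raise ValueError("this sketch only covers k > 2")

    def f(xi):
        # xi * (e^xi - 1) / (e^xi - 1 - xi); increasing in xi, tends to 2 as xi -> 0.
        return xi * math.expm1(xi) / (math.expm1(xi) - xi)

    lo, hi = 0.0, 50.0  # assumed bracket; f is only evaluated at interior midpoints
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < k:
            lo = mid
        else:
            hi = mid
    xi = (lo + hi) / 2
    return xi / (k * (1 - math.exp(-xi)) ** (k - 1))
```

For $k = 3$ this evaluates to roughly 0.918, and the values increase towards 1 as $k$ grows.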

Conjecture 1. Let $(\rho_x)_{x \in S}$ be an optimal sequence for parameters $(n, m, \Delta)$ in
the situation of Theorem 1 (ii) for $n = c \cdot m$ and constant $c > 0$ and $\Delta > 2$. Let
$\alpha = \lceil\Delta\rceil - \Delta$.
(i) If $c < c^*(\Delta)$, then $\rho_x(\lfloor\Delta\rfloor) = 1$ for $\alpha \cdot n$ nodes and $\rho_x(\lceil\Delta\rceil) = 1$ for $(1 - \alpha) \cdot n$
nodes (assuming that $\alpha \cdot n$ is an integer).
(ii) If $c > c^*(\Delta)$, then $\rho_x(\lfloor\Delta\rfloor) = \alpha$ and $\rho_x(\lceil\Delta\rceil) = 1 - \alpha$ for all $x \in S$.

That is, if $c$ is to the left of the threshold then it is optimal to fix the degrees
of the left nodes, and if $c$ is to the right of the threshold then it is optimal to
let each left node choose its degree at random from $\lfloor\Delta\rfloor$ and $\lceil\Delta\rceil$, by identical,
independent experiments.
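The two strategies of Conjecture 1 can be written down as degree generators. This is a small sketch with hypothetical function names; it assumes the average degree is passed as a number `delta` and, for the fixed strategy, that the product of the fraction $\lceil\Delta\rceil - \Delta$ and $n$ is (up to rounding) an integer.

```python
import math
import random


def fixed_degrees(n, delta):
    """Strategy (i): a fixed share of nodes gets floor(delta), the rest ceil(delta)."""
    low, high = math.floor(delta), math.ceil(delta)
    if low == high:
        return [low] * n
    count_low = round((high - delta) * n)  # assumes this product is an integer
    return [low] * count_low + [high] * (n - count_low)


def binomial_degrees(n, delta, rng):
    """Strategy (ii): every node independently picks floor(delta) with prob. ceil(delta) - delta."""
    low, high = math.floor(delta), math.ceil(delta)
    alpha = high - delta
    return [low if rng.random() < alpha else high for _ in range(n)]
```

In the first case the number of nodes of the lower degree is deterministic; in the second it is a random quantity whose expectation is the same.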

Overview of the paper The next section, which is also the main part, covers 
the proof of Theorem 1. It is followed by a section devoted to the discussion of 
Conjecture 1. The proof of Proposition 1 is given in Appendix B, since it is only 
using standard techniques on concentration bounds for nodes of certain degrees. 

2 Optimality of Concentration in a Unit Length Interval 

In this section we prove Theorem 1. We define the success probability of a random
graph as the probability that this graph has a matching. Let $n$, $m$ and $\Delta$ be fixed
and consider some arbitrary but fixed sequence of probability mass functions
$(\rho_x)_{x \in S}$. We will show that if this sequence has certain properties then we can
modify it, obtaining a new sequence $(\rho'_x)_{x \in S}$ with the same average expected
value $\Delta$, such that $G(\Delta, (\rho'_x)_{x \in S})$ has a strictly higher success probability than
$G(\Delta, (\rho_x)_{x \in S})$.

Lemma 1 (Variant of [2, Proposition 4]). Let $(\rho_x)_{x \in S}$ be given. Let $z \in S$
be arbitrary but fixed. If in $\rho_z$ two degrees with distance at least 2 have nonzero
probability then $(\rho_x)_{x \in S}$ is not optimal.

The lemma was stated in [2] and proven in [1] for a slightly different graph
model. Its proof runs along the lines of [1]; it is included in Appendix A for the
convenience of the reader. After applying the first lemma repeatedly one sees
that in an optimal sequence each left node has either a fixed degree (with
probability 1) or two possible degrees with non-zero probability, where these
degrees differ by 1. The lemma and [1,2] do not say anything about the relation
between the degrees of different nodes. This follows next.

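Lemma 2. Let $(\rho_x)_{x \in S}$ be given, where for each $x \in S$ the only degrees with
nonzero probability are from $\{\lfloor\Delta_x\rfloor, \lceil\Delta_x\rceil\}$. Let $y, z \in S$ be arbitrary but fixed. If
$\lfloor\Delta_y\rfloor$ and $\lfloor\Delta_z\rfloor$ have distance at least 2, or $\lceil\Delta_y\rceil$ and $\lceil\Delta_z\rceil$ have distance at least 2,
then $(\rho_x)_{x \in S}$ is not optimal.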

Lemma 2 is proved in Section 2.1. Using Lemma 2 one concludes that an optimal
sequence restricts the means $\Delta_x$, for each $x \in S$, to an open interval $(l - 1, l + 1)$
for some integer constant $l \ge 2$. Hence all degrees that appear with non-zero
probability must be from $\{l - 1, l, l + 1\}$. With the help of the next lemma one
concludes that actually two values are enough.

Lemma 3. Let $(\rho_x)_{x \in S}$ be given, where for each $x \in S$ the only degrees with
nonzero probability are from $\{\lfloor\Delta_x\rfloor, \lceil\Delta_x\rceil\}$. Let $y, z \in S$ be arbitrary but fixed and
assume that $\Delta_y$ and $\Delta_z$ are non-integral. If $\lceil\Delta_y\rceil$ and $\lfloor\Delta_z\rfloor$ have distance 2 then
$(\rho_x)_{x \in S}$ is not optimal.

Lemma 3 is proved in Section 2.2. Combining Lemmas 1, 2, and 3, we obtain the
following for an optimal sequence. If $l \le \Delta < l + 1$ then it holds $l \le \Delta_x \le l + 1$,
for all $x \in S$, and all degrees that appear with non-zero probability must be from
$\{l, l + 1\}$. If $\Delta$ is an integer, then by definition of $\Delta$, we have $\rho_x(\Delta) = 1$ for all
$x \in S$. Hence Theorem 1 follows.

So, to complete the proof of the theorem, it remains to show the three lemmas, 
which is done in the following two sections for Lemmas 2 and 3, and in Appendix A 
for Lemma 1. We make use of the following definitions. 

For each set $S' \subseteq S$ let $G_{S'}$ be the induced bipartite subgraph of $G$ with left
node set $S'$ and right node set $T$, particularly $G_S = G$. A matching in $G_{S'}$ is a
matching that covers all left nodes (left-perfect matching). We define $M_{S'}$ as
the event that $G_{S'}$ has a matching.

2.1 Average Degrees of Different Nodes are Close 

In this section we prove Lemma 2. Consider the probability mass functions $\rho_y$
and $\rho_z$ for the degrees $D_y$ and $D_z$, respectively. By the hypothesis of the lemma,
$\rho_y$ and $\rho_z$ are concentrated on two values each, i.e.,

$$\rho_y(k) = p, \quad \rho_y(k+1) = 1 - p, \qquad \rho_z(l) = q, \quad \rho_z(l+1) = 1 - q,$$

with $p \in [0, 1]$ and $q \in [0, 1]$. By the assumption, we may arrange things so that
$k - l \ge 2$ and

(i) $k = \lfloor\Delta_y\rfloor$, $l = \lfloor\Delta_z\rfloor$ as well as $p = 1 - (\Delta_y - \lfloor\Delta_y\rfloor)$, $q = 1 - (\Delta_z - \lfloor\Delta_z\rfloor)$,
or (ii) $k + 1 = \lceil\Delta_y\rceil$, $l + 1 = \lceil\Delta_z\rceil$ as well as $p = \lceil\Delta_y\rceil - \Delta_y$, $q = \lceil\Delta_z\rceil - \Delta_z$.

We will show that changing $\rho_y$ to $\rho'_y$ and $\rho_z$ to $\rho'_z$ such that $\Delta'_y = \Delta_y - 1$ and
$\Delta'_z = \Delta_z + 1$, via

$$\rho'_y(k-1) = p, \quad \rho'_y(k) = 1 - p, \qquad \rho'_z(l+1) = q, \quad \rho'_z(l+2) = 1 - q,$$

will strictly increase the probability that $G_S$ has a matching, while it does not
change $\Delta$. For this, we will show

$$\Pr(\overline{M_S} \mid \rho_y, \rho_z) > \Pr(\overline{M_S} \mid \rho'_y, \rho'_z),$$

where $\overline{M_S}$ denotes the complement of $M_S$, abusing conditional notation a little
to indicate changed probability spaces. We fix the neighborhood $N_x$ for the
remaining elements $x \in S - \{y, z\}$ and therefore the graph $G_{S-\{y,z\}}$. Since there can
be a matching for $S$ only if there is a matching for $S - \{y, z\}$ it is sufficient to
show that

$$\Pr(\overline{M_S} \mid M_{S-\{y,z\}}, \rho_y, \rho_z) > \Pr(\overline{M_S} \mid M_{S-\{y,z\}}, \rho'_y, \rho'_z). \tag{1}$$

Let $\mathrm{Fail}(d_y, d_z) = \Pr(\overline{M_S} \mid M_{S-\{y,z\}}, D_y = d_y, D_z = d_z)$. Then (1) holds if and
only if

$$\sum_{\substack{d_y \in \{k, k+1\} \\ d_z \in \{l, l+1\}}} \mathrm{Fail}(d_y, d_z) \cdot \rho_y(d_y) \cdot \rho_z(d_z) > \sum_{\substack{d_y \in \{k-1, k\} \\ d_z \in \{l+1, l+2\}}} \mathrm{Fail}(d_y, d_z) \cdot \rho'_y(d_y) \cdot \rho'_z(d_z). \tag{2}$$

Note that if $k - l = 2$ then the summand regarding $d_y = k$ and $d_z = l + 1$ on the
left-hand side is the same as the summand regarding $d_y = k - 1$ and $d_z = l + 2$
on the right-hand side. Hence, to prove (2) it is sufficient to show that

$$\mathrm{Fail}(k, l) > \mathrm{Fail}(k - 1, l + 1). \tag{3}$$

For this, consider the fixed graph $G_{S-\{y,z\}}$. We classify the right nodes of $G_{S-\{y,z\}}$
according to the following three types:

- We call $v$ blocked if $v$ is matched in all matchings of $G_{S-\{y,z\}}$.

- We call $v$ free if $v$ is never matched in any matching of $G_{S-\{y,z\}}$.

- We call $v$ half-free if $v$ is neither a blocked nor a free node.

Let $B$ be the set of blocked nodes, let $F$ be the set of free nodes, and let $HF$
be the set of half-free nodes. Elements of $\overline{B} = F \cup HF$ are called non-blocked
nodes. For a moment consider only the non-blocked nodes. For each right node
set $V \subseteq \overline{B}$ let $H_V$ be an auxiliary graph with node set $V$ that has an edge
between two nodes $v_1, v_2 \in V$ if and only if there exists a matching for $G_{S-\{y,z\}}$
in which $v_1$ and $v_2$ simultaneously are not matched. Let $V$ be an arbitrary but
fixed subset of $\overline{B}$. The following observation is crucial.

Claim 1. If $H_V$ has any edges at all then it is connected.

Proof of Claim. First note that if there is a free node in $V$ then $H_V$ is
connected by definition of the edge set of $H_V$. Therefore it remains to consider
the case where all nodes of $V$ are half-free nodes. It is sufficient to show that
if for three nodes $v_1, v_2, v_3$ from $HF$ the edge $(v_1, v_2)$ is in $H_V$ then one of the
edges $(v_1, v_3)$ or $(v_2, v_3)$ must be present as well. Assume for a contradiction that
$(v_1, v_2)$ is an edge but $v_3$ is neither adjacent to $v_1$ nor to $v_2$. This implies that
there are two matchings in $G_{S-\{y,z\}}$, $M$ and $M'$ say, such that in $M$

- node $v_3$ is unmatched ($v_3$ is a non-blocked node), but

- nodes $v_1$ and $v_2$ are matched since edges $(v_1, v_3)$ and $(v_2, v_3)$ are not in $H_V$,

and in $M'$ we have:

- node $v_3$ is matched ($v_3$ is a half-free node), but

- $v_1$ and $v_2$ are unmatched since edge $(v_1, v_2)$ is in $H_V$.

Now consider the bipartite multigraph $M \cup M'$ consisting of all edges from both
matchings and the corresponding nodes. The graph $M \cup M'$ has the following
properties: Nodes on the left side have degree 2 (both matchings are left-perfect).
Nodes on the right side have degree 1 or 2; in particular, $v_1, v_2, v_3$ have degree 1.
Hence $M \cup M'$ has only paths and cycles of even length. On all paths and cycles
edges from $M$ and $M'$ alternate. Nodes $v_1$ and $v_2$ must be at the ends of two
distinct paths (since both are incident to $M$-edges). Node $v_3$ must be at the end
of a path (incident to an $M'$-edge).

Without loss of generality, we may assume that $v_1$ and $v_3$ do not lie on the
same path. Starting from $M'$, we get a new matching in which neither $v_1$ nor $v_3$
are matched by replacing the $M'$-edges on the path with $v_3$ by the $M$-edges on
this path. Therefore there must be an edge $(v_1, v_3)$ in $H_V$, which contradicts our
assumption, proving the claim. □

Now consider the set $\overline{B}$ of non-blocked nodes and the corresponding graph $H_{\overline{B}}$.
We define $\sim$ as the following binary relation: $v_1 \sim v_2$, for nodes $v_1$ and $v_2$, if
$(v_1, v_2)$ is not an edge in $H_{\overline{B}}$.

Claim 2. The relation $\sim$ (no edge) is an equivalence relation.

Proof of Claim. Clearly $\sim$ is reflexive and symmetric. Assume for a contradiction
that $\sim$ is not transitive. That is, we have three nodes $v_1$, $v_2$ and $v_3$ with $v_1 \sim v_2$
and $v_2 \sim v_3$ but $v_1 \not\sim v_3$. Let $V = \{v_1, v_2, v_3\}$. Since $v_1 \not\sim v_3$, the edge $(v_1, v_3)$ is
in $H_{\overline{B}}$ and therefore in $H_V$. According to Claim 1 $H_V$ must be connected, i.e.,
$H_V$ and therefore $H_{\overline{B}}$ must contain $(v_1, v_2)$ or $(v_2, v_3)$. Hence $v_1 \not\sim v_2$ or $v_2 \not\sim v_3$,
which is a contradiction. □
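The classification of right nodes and the "no edge" relation can be explored by brute force on tiny graphs. The following is only an illustrative sketch: the function names are hypothetical, the graph is assumed to be given as one neighbor list per left node, and all left-perfect matchings are enumerated exhaustively, so it is usable only for very small inputs.

```python
from itertools import product


def matched_right_sets(graph):
    """Collect, for every left-perfect matching, the set of right nodes it uses."""
    result = set()
    for choice in product(*graph):  # one neighbor per left node
        if len(set(choice)) == len(choice):  # all chosen right nodes distinct
            result.add(frozenset(choice))
    if not result:
        raise ValueError("the graph has no left-perfect matching")
    return result


def classify(graph, m):
    """Return the sets (blocked, free, half_free) of right nodes in range(m)."""
    used_sets = matched_right_sets(graph)
    blocked = {v for v in range(m) if all(v in s for s in used_sets)}
    free = {v for v in range(m) if all(v not in s for s in used_sets)}
    half_free = set(range(m)) - blocked - free
    return blocked, free, half_free


def independent_classes(graph, m):
    """Group the non-blocked nodes into classes of the relation 'no edge'."""
    used_sets = matched_right_sets(graph)
    blocked, _, _ = classify(graph, m)
    non_blocked = sorted(set(range(m)) - blocked)

    def has_edge(u, v):
        # Edge iff some matching leaves u and v unmatched simultaneously.
        return any(u not in s and v not in s for s in used_sets)

    classes = []
    for v in non_blocked:
        for cls in classes:
            # Comparing with one representative suffices because, by Claim 2,
            # the relation is an equivalence relation.
            if not has_edge(v, cls[0]):
                cls.append(v)
                break
        else:
            classes.append([v])
    return classes
```

For the graph `[[0, 1], [1, 2]]` with four right nodes, nodes 0, 1, 2 are half-free and form one class, while the free node 3 forms a class of its own.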



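According to the claim it follows that the right node set $T$ of $G_{S-\{y,z\}}$ can be
subdivided into disjoint segments $B \cup I_1 \cup I_2 \cup \cdots = T$, where $B$ is the set of
blocked nodes and $I_1, I_2, \ldots$ are the maximal independent sets in $H_{\overline{B}}$ and the
equivalence classes of $\sim$, respectively. For each pair $I_s, I_t$, with $s \ne t$, it holds
that $H_{I_s \cup I_t}$ is a complete bipartite graph. Note that each free node leads to a
one-element set $I_s$. With this characterization of $H_{\overline{B}}$ we can express the event
that for a fixed neighborhood $N_x$, $x \in S - \{y, z\}$, which admits a matching for
$G_{S-\{y,z\}}$, there is no matching for $G_S$ as follows:

$$\{N_y \subseteq B\} \cup \{N_z \subseteq B\} \cup \bigcup_{j} \{(N_y \cup N_z) \subseteq (B \cup I_j)\}. \tag{4}$$

Let $BI_{S-\{y,z\}}(b, r, i_1, \ldots, i_r)$ be the event that $G_{S-\{y,z\}}$ has $|B| = b$ many
blocked nodes and $r$ (nonempty) maximal independent sets according to the
definition above, with $|I_j| = i_j$ and $i_1 \le i_2 \le \ldots \le i_r$. Let

$$\mathrm{fail}(d_y, d_z, b, r, i_1, \ldots, i_r) = \Pr\big(\overline{M_S} \mid M_{S-\{y,z\}}, D_y = d_y, D_z = d_z, BI_{S-\{y,z\}}(b, r, i_1, \ldots, i_r)\big),$$

where the bar denotes the complementary event. Then (4) implies that

$$\mathrm{fail}(d_y, d_z, b, r, i_1, \ldots, i_r) = \left(\frac{b}{m}\right)^{d_y} + \left(\frac{b}{m}\right)^{d_z} - \left(\frac{b}{m}\right)^{d_y + d_z} + \sum_{j=1}^{r} \left[\left(\frac{i_j + b}{m}\right)^{d_y} - \left(\frac{b}{m}\right)^{d_y}\right] \cdot \left[\left(\frac{i_j + b}{m}\right)^{d_z} - \left(\frac{b}{m}\right)^{d_z}\right].$$
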
Using the law of total probability we can rewrite the value $\mathrm{Fail}(d_y, d_z)$ (line below
(1)) as follows:

$$\mathrm{Fail}(d_y, d_z) = \sum_{(b, r, i_1, \ldots, i_r)} \mathrm{fail}(d_y, d_z, b, r, i_1, \ldots, i_r) \cdot \Pr\big(BI_{S-\{y,z\}}(b, r, i_1, \ldots, i_r) \mid M_{S-\{y,z\}}\big).$$

We will abbreviate $\mathrm{fail}(d_y, d_z, b, r, i_1, \ldots, i_r)$ by $\mathrm{fail}(d_y, d_z)$ for the rest of the
paper. In order to prove (3) it is sufficient to show

$$\mathrm{fail}(k, l) > \mathrm{fail}(k - 1, l + 1), \tag{5}$$

for each $BI$-vector $(b, r, i_1, \ldots, i_r)$. Let $\gamma_j = i_j/m$ and let $\beta = b/m$. Thus,

$$\mathrm{fail}(k, l) = \beta^k + \beta^l - \beta^{k+l} + \sum_{j=1}^{r} \left[(\gamma_j + \beta)^k - \beta^k\right] \cdot \left[(\gamma_j + \beta)^l - \beta^l\right]. \tag{6}$$

Hence, inequality (5) holds if and only if

$$(1 - \beta) \cdot (\beta^l - \beta^{k-1}) > \sum_{j=1}^{r} \gamma_j \cdot \left[\beta^l \cdot (\gamma_j + \beta)^{k-1} - \beta^{k-1} \cdot (\gamma_j + \beta)^l\right]. \tag{7}$$

Note that if $r = 1$ there is no matching for $G_S$. Hence we are only interested
in the case $r \ge 2$, which implies that $i_j < m - b$ and $\gamma_j < 1 - \beta$, respectively.
Consider the right-hand side of the inequality. The expression within the square
brackets increases monotonically with increasing $\gamma_j$, since we have

$$\frac{\partial}{\partial \gamma_j}: \quad (k - 1) \cdot \beta^l \cdot (\gamma_j + \beta)^{k-2} - l \cdot \beta^{k-1} \cdot (\gamma_j + \beta)^{l-1} > 0 \iff \frac{k-1}{l} \cdot (\gamma_j + \beta)^{k-l-1} > \beta^{k-l-1},$$

and the last inequality holds because of $k - l \ge 2$ and $\gamma_j + \beta > \beta$. Therefore
replacing $\gamma_j$ with $1 - \beta$ within the square brackets and using that $\sum_{j=1}^{r} \gamma_j = 1 - \beta$
strictly increases the right-hand side of (7) and yields the left-hand side of (7). But
since we assume $\gamma_j < 1 - \beta$ the strict inequality holds. Due to the fact that the event
$\{r \ge 2\}$ has positive probability Lemma 2 follows. ■
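Inequality (5) can be checked exactly on small parameter sets with rational arithmetic. The sketch below is an illustration only: it reads (6) as fail$(d_y, d_z) = \beta^{d_y} + \beta^{d_z} - \beta^{d_y + d_z} + \sum_j [(\gamma_j + \beta)^{d_y} - \beta^{d_y}] \cdot [(\gamma_j + \beta)^{d_z} - \beta^{d_z}]$, restricts itself to $r = 2$ classes and $b \ge 1$, and uses hypothetical names and hypothetical default values for $m$ and the largest degree.

```python
from fractions import Fraction


def fail(d_y, d_z, beta, gammas):
    """Conditional failure probability in the form of formula (6)."""
    base = beta**d_y + beta**d_z - beta**(d_y + d_z)
    cross = sum(
        ((g + beta)**d_y - beta**d_y) * ((g + beta)**d_z - beta**d_z)
        for g in gammas
    )
    return base + cross


def check_inequality_5(m=12, max_degree=6):
    """Check fail(k, l) > fail(k - 1, l + 1) for all k - l >= 2 on a small grid."""
    for b in range(1, m - 1):            # number of blocked nodes, b >= 1
        beta = Fraction(b, m)
        rest = m - b                     # non-blocked nodes, split into two classes
        for i1 in range(1, rest):
            gammas = [Fraction(i1, m), Fraction(rest - i1, m)]
            for l in range(1, max_degree - 1):
                for k in range(l + 2, max_degree + 1):
                    if not fail(k, l, beta, gammas) > fail(k - 1, l + 1, beta, gammas):
                        return False
    return True
```

With a single class, i.e. `gammas = [1 - beta]`, the function `fail` returns exactly 1, in line with the remark that there is no matching for $G_S$ if $r = 1$.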

2.2 Optimal Distributions Use Only Two Neighboring Degrees 

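In this section we prove Lemma 3. Consider the probability mass functions $\rho_y$
and $\rho_z$ for the degrees $D_y$ and $D_z$, respectively. Let $\lfloor\Delta_y\rfloor = l$ and $\lfloor\Delta_z\rfloor = l - 1$ as
well as $p = 1 - (\Delta_y - \lfloor\Delta_y\rfloor)$ and $q = 1 - (\Delta_z - \lfloor\Delta_z\rfloor)$. By the hypothesis of the
lemma we have

$$\rho_y(l) = p, \quad \rho_y(l+1) = 1 - p, \qquad \rho_z(l-1) = q, \quad \rho_z(l) = 1 - q,$$

with $p \in (0, 1)$ and $q \in (0, 1)$. To prove Lemma 3 we will show that changing $\rho_y$
to $\rho'_y$ and $\rho_z$ to $\rho'_z$, via

$$\rho'_y(l) = p + \varepsilon, \quad \rho'_y(l+1) = 1 - p - \varepsilon, \qquad \rho'_z(l-1) = q - \varepsilon, \quad \rho'_z(l) = 1 - q + \varepsilon,$$

for some small perturbation $\varepsilon \ne 0$ will strictly increase the probability that $G_S$
has a matching, while it does not change $\Delta$. Hence as in the proof of Lemma 2
we will show that

$$\Pr(\overline{M_S} \mid \rho_y, \rho_z) > \Pr(\overline{M_S} \mid \rho'_y, \rho'_z).$$

As before we fix the neighborhood $N_x$ for the remaining elements $x \in S - \{y, z\}$
and therefore the graph $G_{S-\{y,z\}}$. As in Lemma 2 we conclude that it is sufficient
to show that for some perturbation term $\varepsilon \ne 0$ we have

$$\sum_{\substack{d_y \in \{l, l+1\} \\ d_z \in \{l-1, l\}}} \mathrm{Fail}(d_y, d_z) \cdot \rho_y(d_y) \cdot \rho_z(d_z) > \sum_{\substack{d_y \in \{l, l+1\} \\ d_z \in \{l-1, l\}}} \mathrm{Fail}(d_y, d_z) \cdot \rho'_y(d_y) \cdot \rho'_z(d_z).$$

Subtracting the left-hand side from the right-hand side gives

$$[-\varepsilon^2 - \varepsilon \cdot (p - q)] \cdot \underbrace{[\mathrm{Fail}(l, l-1) + \mathrm{Fail}(l+1, l) - \mathrm{Fail}(l, l) - \mathrm{Fail}(l+1, l-1)]}_{K_0} - \varepsilon \cdot \underbrace{[\mathrm{Fail}(l+1, l-1) - \mathrm{Fail}(l, l)]}_{K_1} < 0$$

$$\iff -\varepsilon^2 \cdot K_0 - \varepsilon \cdot \underbrace{[(p - q) \cdot K_0 + K_1]}_{L} < 0. \tag{8}$$
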

From (3), which was proven in Lemma 2, it follows that $K_1 > 0$. There are three
cases.

$K_0 = 0$. Since we have $K_1 > 0$, it is easy to see that (8) holds for $\varepsilon > 0$.

$K_0 > 0$. Regardless whether $L$ is zero, positive, or negative, (8) holds for some
small $\varepsilon \ne 0$.

$K_0 < 0$. The only critical case would be $L = 0$, but we will show that it holds
$K_1 > -K_0$ and therefore $L > 0$, implying that (8) holds for small $\varepsilon > 0$.
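The coefficients in (8) can be traced by expanding the four products of the perturbed probabilities. The following worked expansion is a sketch under the perturbation stated at the beginning of this section ($\rho_y(l)$ is raised by $\varepsilon$, $\rho_z(l-1)$ is lowered by $\varepsilon$); $K_0$ and $K_1$ denote the two bracketed combinations of Fail values in (8).

```latex
\begin{align*}
(p+\varepsilon)(q-\varepsilon) - pq
  &= -\varepsilon^2 - \varepsilon(p-q)
  && \text{coefficient of } \mathrm{Fail}(l, l-1), \\
(p+\varepsilon)(1-q+\varepsilon) - p(1-q)
  &= \varepsilon^2 + \varepsilon(p-q) + \varepsilon
  && \text{coefficient of } \mathrm{Fail}(l, l), \\
(1-p-\varepsilon)(q-\varepsilon) - (1-p)q
  &= \varepsilon^2 + \varepsilon(p-q) - \varepsilon
  && \text{coefficient of } \mathrm{Fail}(l+1, l-1), \\
(1-p-\varepsilon)(1-q+\varepsilon) - (1-p)(1-q)
  &= -\varepsilon^2 - \varepsilon(p-q)
  && \text{coefficient of } \mathrm{Fail}(l+1, l).
\end{align*}
```

Summing the four lines against their Fail values gives $[-\varepsilon^2 - \varepsilon(p-q)] \cdot K_0 - \varepsilon \cdot K_1$. In the third case, $|p - q| < 1$ and $K_0 < 0$ yield $(p - q) \cdot K_0 > K_0$, so $L > K_0 + K_1$, which is positive once $K_1 > -K_0$ is established.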

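Inequality $K_1 > -K_0$ holds if and only if

$$\mathrm{Fail}(l+1, l) + \mathrm{Fail}(l, l-1) > 2 \cdot \mathrm{Fail}(l, l).$$

As before we will simply show the sufficient condition

$$\mathrm{fail}(l+1, l) + \mathrm{fail}(l, l-1) > 2 \cdot \mathrm{fail}(l, l).$$

Using (6) in combination with the substitutions $\gamma_j = i_j/m$ and $\beta = b/m$ the
condition can be written as

$$(1 - \beta)^2 \cdot \left[\beta^{l-1} - \beta^{2l-1}\right] > \sum_{j=1}^{r} (1 - \beta)^2 \cdot \left[(\gamma_j + \beta)^l \cdot \beta^{l-1} - \beta^{2l-1}\right] - \sum_{j=1}^{r} \left[(\gamma_j + \beta)^l - \beta^l\right] \cdot (\gamma_j + \beta)^{l-1} \cdot (1 - \gamma_j - \beta)^2.$$

Note that the subtrahend of the right-hand side is non-negative. Hence it is
sufficient to show that

$$(1 - \beta)^2 \cdot \left[\beta^{l-1} - \beta^{2l-1}\right] > (1 - \beta)^2 \cdot \sum_{j=1}^{r} (\gamma_j + \beta)^l \cdot \beta^{l-1} - r \cdot (1 - \beta)^2 \cdot \beta^{2l-1}. \tag{9}$$

Bounding $\sum_{j=1}^{r} (\gamma_j + \beta)^l$ using the binomial theorem gives

$$\sum_{j=1}^{r} (\gamma_j + \beta)^l = \sum_{j=1}^{r} \sum_{i=0}^{l} \binom{l}{i} \cdot \gamma_j^i \cdot \beta^{l-i} = r \cdot \beta^l + \sum_{i=1}^{l} \binom{l}{i} \cdot \beta^{l-i} \cdot \sum_{j=1}^{r} \gamma_j^i$$

$$< r \cdot \beta^l + \sum_{i=1}^{l} \binom{l}{i} \cdot \beta^{l-i} \cdot \Big[\sum_{j=1}^{r} \gamma_j\Big]^i = (r - 1) \cdot \beta^l + 1,$$

where the last step follows from $\sum_{j=1}^{r} \gamma_j = 1 - \beta$. Substituting $\sum_{j=1}^{r} (\gamma_j + \beta)^l$
with $(r - 1) \cdot \beta^l + 1$ shows that (9) holds and thus the lemma. ■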

3 A Conjecture: Essentially Two Different Strategies 

In this section, we give evidence for Conjecture 1, which says that essentially
two types of distributions may be optimal: one in which all keys are given fixed
degrees $l$ or $l + 1$, and one in which each node chooses one of $l$ and $l + 1$ at
random, governed by the same distribution on $\{l, l + 1\}$. We indicate under what
circumstances the one or the other is best.

Assume we are in the situation of Theorem 1 (ii), i.e., $l < \Delta < l + 1$ for some
integer constant $l \ge 2$ and it holds $\rho_x(l) \in [0, 1]$ and $\rho_x(l + 1) = 1 - \rho_x(l)$ for
each $x$ from $S$. Let $y$ and $z$ be two arbitrary but fixed elements of $S$ with

$$\rho_y(l) = p, \quad \rho_y(l+1) = 1 - p, \qquad \rho_z(l) = q, \quad \rho_z(l+1) = 1 - q,$$

for $p \in [0, 1]$ and $q \in [0, 1]$. We would like to know if the matching probability
increases if we change the probability mass functions $\rho_y$ and $\rho_z$ to $\rho'_y$ and $\rho'_z$, via

$$\rho'_y(l) = p + \varepsilon, \quad \rho'_y(l+1) = 1 - p - \varepsilon, \qquad \rho'_z(l) = q - \varepsilon, \quad \rho'_z(l+1) = 1 - q + \varepsilon,$$

for some $\varepsilon > 0$. We note the following.

1. If $p > q$, i.e., $\Delta_y < \Delta_z$, this modification would move both means towards
the boundary of the interval $[l, l + 1]$. Moving a mean beyond the boundary
cannot increase the matching probability since this would be a contradiction
to Lemma 3.

2. If $p < q$, i.e., $\Delta_y > \Delta_z$, this modification would move both means towards
each other.

As in Lemma 3 it can be shown that the matching probability increases iff

$$\sum_{d_y, d_z \in \{l, l+1\}} \mathrm{Fail}(d_y, d_z) \cdot \rho_y(d_y) \cdot \rho_z(d_z) > \sum_{d_y, d_z \in \{l, l+1\}} \mathrm{Fail}(d_y, d_z) \cdot \rho'_y(d_y) \cdot \rho'_z(d_z).$$

This inequality is equivalent to

$$[-\varepsilon^2 - \varepsilon \cdot (p - q)] \cdot \underbrace{[\mathrm{Fail}(l, l) - 2 \cdot \mathrm{Fail}(l, l+1) + \mathrm{Fail}(l+1, l+1)]}_{K} < 0, \tag{10}$$

utilizing the symmetry $\mathrm{Fail}(l + 1, l) = \mathrm{Fail}(l, l + 1)$. Whether inequality (10) holds
or not depends on $K$, which is independent of $y$, $z$ and $p$, $q$. There are three cases.

$K = 0$. The modifications to $\rho_y$ and $\rho_z$ do not change the failure probability.
This case seems unlikely since there would be an infinite number of optimal
sequences of probability mass functions; hence we will ignore this case for
the rest of the discussion.

$K > 0$. Arrange that $p \ge q$ (if necessary interchange $y$ and $z$). Then increasing
$p$ and decreasing $q$ (moving the means away from each other) increases the
success probability.

$K < 0$. Arrange that $p < q$ (if $p = q$ do nothing). Again, increasing $p$ and
decreasing $q$ (moving the means closer together) increases the success probability.

Using the same method as in Lemmas 2 and 3 it is not possible to show that
always $K \le 0$ or always $K \ge 0$ happens. To see this, we try to show $K \ge 0$, which
is equivalent to proving that

$$\mathrm{Fail}(l, l) + \mathrm{Fail}(l + 1, l + 1) \ge 2 \cdot \mathrm{Fail}(l, l + 1). \tag{11}$$

As before we only consider the sufficient condition

$$\mathrm{fail}(l, l) + \mathrm{fail}(l + 1, l + 1) \ge 2 \cdot \mathrm{fail}(l, l + 1). \tag{12}$$

This inequality is equivalent to

$$2 \cdot \beta^l - \beta^{2l} + \sum_{j=1}^{r} \left[(\gamma_j + \beta)^l - \beta^l\right]^2 + 2 \cdot \beta^{l+1} - \beta^{2l+2} + \sum_{j=1}^{r} \left[(\gamma_j + \beta)^{l+1} - \beta^{l+1}\right]^2$$

$$\ge 2 \cdot \beta^l + 2 \cdot \beta^{l+1} - 2 \cdot \beta^{2l+1} + 2 \cdot \sum_{j=1}^{r} \left[(\gamma_j + \beta)^l - \beta^l\right] \cdot \left[(\gamma_j + \beta)^{l+1} - \beta^{l+1}\right],$$

where we use the substitutions $\gamma_j = i_j/m$ and $\beta = b/m$. Moving the $\sum$-terms
to the left and the remaining $\beta$-terms to the right gives

$$\sum_{j=1}^{r} \left[(\gamma_j + \beta)^l \cdot (1 - \gamma_j - \beta) - \beta^l \cdot (1 - \beta)\right]^2 \ge \beta^{2l} \cdot (1 - \beta)^2.$$

However, this inequality may hold or may not hold depending on $\gamma_j$ and $\beta$. For
example, consider the events

1. $\{\beta = 0\}$, then the inequality is true for all $r \ge 2$, and

2. $\{r = 2, \gamma_1 = \gamma_2 = 1/(2 \cdot l), \beta = 1 - 1/l\}$, then the inequality is false.

Note that events 1. and 2. have positive probability.
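The two example events can be verified with exact rational arithmetic. The sketch below assumes the last displayed inequality is read as $\sum_j [(\gamma_j + \beta)^l (1 - \gamma_j - \beta) - \beta^l (1 - \beta)]^2 \ge \beta^{2l} (1 - \beta)^2$; the function name is hypothetical.

```python
from fractions import Fraction


def lhs_minus_rhs(l, beta, gammas):
    """Left-hand side minus right-hand side of the rearranged form of (12)."""
    lhs = sum(
        ((g + beta)**l * (1 - g - beta) - beta**l * (1 - beta))**2
        for g in gammas
    )
    rhs = beta**(2 * l) * (1 - beta)**2
    return lhs - rhs


# Event 1: no blocked nodes (beta = 0); the difference is non-negative.
event_1 = lhs_minus_rhs(3, Fraction(0), [Fraction(1, 3), Fraction(2, 3)])

# Event 2: r = 2, gamma_1 = gamma_2 = 1/(2l), beta = 1 - 1/l; the difference is negative.
l = 3
event_2 = lhs_minus_rhs(l, Fraction(l - 1, l), [Fraction(1, 2 * l)] * 2)
```

Here `event_1` is positive and `event_2` is negative, so neither sign of the difference can be excluded by this method.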

It follows that there exist graphs $G_{S-\{y,z\}}$ in which (12) is true as well as
graphs in which (12) is false. Hence, it could be possible that there are nodes
$y_1, z_1$ with $K < 0$ (their means should be made equal), and nodes $y_2, z_2$ with
$K > 0$ (their means should be moved away from each other). So hypothetically,
it could be optimal when $S$ is subdivided into 3 disjoint sets $S_l$, $S_{l+1}$, and $S_{l,l+1}$
where each node from $S_l$ has fixed degree $l$ and each node from $S_{l+1}$ has fixed
degree $l + 1$ and each node from $S_{l,l+1}$ has the same mean, which lies in $(l, l + 1)$, and the
degree of each node is concentrated on $l$ and $l + 1$. But this would mean if we
assume such an "optimal situation" and we have three different nodes, say $y_1$, $y_2$
and $z$, where $y_1, y_2 \in S_{l,l+1}$ and $z \in S_l$, then it must hold $K > 0$ for $G_{S-\{y_1,z\}}$
and $K < 0$ for $G_{S-\{y_1,y_2\}}$, which seems unlikely since $S - \{y_1, z\}$ and $S - \{y_1, y_2\}$
differ in only one node. (Recall that $K = 0$ does not seem plausible, either.)
Therefore we conjecture that it is optimal if it holds
1. either $S = S_l \cup S_{l+1}$, that is for each $x$ from $S$ the mean $\Delta_x$ is fixed to one of
the interval borders $l$ and $l + 1$, and therefore a fixed fraction of $\lceil\Delta\rceil - \Delta$ of
the nodes have degree $l$ (assuming that $\Delta \cdot n$ is an integer).
2. or $S = S_{l,l+1}$, that is it holds $\Delta = \Delta_x$ for each $x$ from $S$, and therefore the
number of nodes of degree $l$ follows a binomial distribution with parameters
$n$ and $\lceil\Delta\rceil - \Delta$.

For the rest of the discussion we only focus on these two degree distributions 

(fixed and binomial) and we try to argue under which conditions case 1 is optimal 

and when case 2 is optimal. 

Again our starting point is (11) and $K \ge 0$, respectively. Now fix the degree
of all left nodes from $G_S$ and let $\alpha$ be the fraction of nodes from $S$ with degree $l$
as well as let $\alpha'$ be the fraction of nodes from $S - \{y, z\}$ with degree $l$. Then
there are three situations to distinguish according to the degrees of $y$ and $z$.
(i) $\alpha = \alpha' + 2/n$, that is $y$ and $z$ have degree $l$,
(ii) $\alpha = \alpha' + 1/n$, that is one node has degree $l$ and the other node has degree $l + 1$,
(iii) $\alpha = \alpha'$, that is both nodes have degree $l + 1$.

Inequality (11) states that the increase of the failure probability from (ii) to (i)
is larger than the increase of the failure probability from (iii) to (ii) for all $\alpha'$
from $[0, 1]$, that is, the failure probability as a function of $\alpha$ should be convex
(while strictly monotonically increasing). Experimental results as shown in Figure
1 suggest that this is not the case in general. In fact there are two different
situations for fixed $\Delta$ shown in Figures 1(a) and 1(b). Let $f(\alpha)$ denote the failure
(a) $c < c^*(\Delta)$, $c = 0.956$ (b) $c > c^*(\Delta)$, $c = 0.958$

Fig. 1. Rate of random bipartite graphs with $D_x \in \{3, 4\}$, $\Delta = 3.5$, m = 10^, that have
no matching, as a function of $\alpha$ (the fraction of left nodes with degree 3). The plots
show that the failure function $f(\alpha)$ probably has a transition from convex to concave.
The theoretical threshold $c^*(3.5)$ is approximately 0.957.

probability as a function of $\alpha$. If $c < c^*(\Delta)$ then $f$ is convex in a neighborhood of
$\lceil\Delta\rceil - \Delta$. Using Jensen's inequality it follows that the failure probability for fixed
degree distribution $f(\lceil\Delta\rceil - \Delta)$ is smaller than the failure probability according
to the binomial distribution $\sum_{i=0}^{n} f(i/n) \cdot \binom{n}{i} \cdot (\lceil\Delta\rceil - \Delta)^i \cdot (1 - \lceil\Delta\rceil + \Delta)^{n-i}$,
ignoring the right tail of the binomial distribution that reaches the concave part
of $f(\alpha)$, since the tail covers only an exponentially small probability mass. If
$c > c^*(\Delta)$ then $f$ is concave in a neighborhood of $\lceil\Delta\rceil - \Delta$ and the binomial
degree distribution leads to a smaller failure probability than the fixed degree
distribution.
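The Jensen step can be illustrated numerically. The sketch below averages a function over the binomial distribution exactly as in the sum above; the function name is hypothetical, and the two test functions used in the usage note are stand-ins chosen for their convexity or concavity, not the actual failure function of the experiments.

```python
from math import comb, sqrt


def binomial_average(f, n, alpha):
    """Average of f(i / n) when i follows a binomial distribution with parameters n, alpha."""
    return sum(
        f(i / n) * comb(n, i) * alpha**i * (1 - alpha)**(n - i)
        for i in range(n + 1)
    )


# Hypothetical stand-ins for the failure function near alpha = 0.5:
convex_avg = binomial_average(lambda a: a * a, 20, 0.5)   # convex f
concave_avg = binomial_average(sqrt, 20, 0.5)             # concave f
```

For the convex stand-in the binomial average exceeds the value at the mean (fixed degrees do better), and for the concave stand-in it falls below it (random degrees do better), which mirrors the two cases of the argument.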



In order to back up this observation, an additional experiment was done which 
directly compares the failure rates for degree distributions around the threshold. 
The results are shown in Figure 2. They confirm the conjecture that if c < c*(a) 
then the fixed degree distribution is optimal, and if c > c*(a) then the binomial 
degree distribution is optimal. 

Fig. 2. Difference of the failure rate of graphs $G(\Delta, (\rho_x)_{x \in S})$ with $D_x \in \{3, 4\}$,
$\Delta = 3.5$, m = 10^ and different $(\rho_x)_{x \in S}$, as a function of $c = n/m$.
Minuend is the failure rate using fixed degree, that is $\rho_x(3) \in \{0, 1\}$, for
each $x \in S$. Subtrahend is the failure rate using binomial degree distribution,
that is $\rho_x(3) = 0.5$, for each $x \in S$.

4 Conclusion 

We found (near) optimal degree distributions for matchings in bipartite multi- 
graphs where each left node chooses its right neighbors randomly with repetition 
according to its assigned degree. For the case that the number of left nodes is 
linear in the number of right nodes we showed that these distributions give match- 
ing thresholds that are the same as the known thresholds for regular/irregular 
$k$-ary cuckoo hashing; and in the case of near optimal degree distributions we
conjectured the optimal distribution as a function of the ratio of left and right
nodes.

Acknowledgment. The authors would like to thank an anonymous reviewer 
for pointing out a gap in an earlier version of the proof of Lemma 3. 
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A Degrees Must be Concentrated Around the Mean 

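In this section we prove Lemma 1. Let $(\rho_x)_{x \in S}$ be given and consider an arbitrary
but fixed element $z$ from $S$ with some initial probability mass function $\rho_z$. We
will show that if there are two degrees of $z$, say $l$ and $k$, with non-zero probability
and it holds that $l < \Delta_z < k$ as well as $k - l \ge 2$, then the probability that there
is a matching for the whole key set $S$ cannot be maximal. More precisely we will
show that modifying $\rho_z$ to $\rho'_z$ via

$$\rho'_z(l) = \rho_z(l) - \varepsilon, \qquad \rho'_z(k) = \rho_z(k) - \varepsilon,$$

$$\rho'_z(l + 1) = \rho_z(l + 1) + \varepsilon, \qquad \rho'_z(k - 1) = \rho_z(k - 1) + \varepsilon,$$

for $\varepsilon \in (0, \min\{\rho_z(l), \rho_z(k)\}]$, decreases the failure probability, that is

$$\Pr(\overline{M_S} \mid \rho_z) > \Pr(\overline{M_S} \mid \rho'_z),$$

while it does not change $\Delta_z$ and $\Delta$. For each element $x \in S - \{z\}$ we fix its degree
and neighborhood $N_x$. The resulting graph $G_{S-\{z\}}$ can have zero, one or more
matchings. Let $B \subseteq T$ be the set of right nodes of $G_{S-\{z\}}$ that are matched in
every matching for $S - \{z\}$. Since there can be a matching for $S$ only if there is
a matching for $S - \{z\}$ it is sufficient to show that

$$\Pr(\overline{M_S} \mid M_{S-\{z\}}, \rho_z) > \Pr(\overline{M_S} \mid M_{S-\{z\}}, \rho'_z). \tag{13}$$

Using the law of total probability we get

$$\sum_{b=0}^{n-1} \Pr(\overline{M_S} \mid M_{S-\{z\}}, \rho_z, |B| = b) \cdot \Pr(|B| = b \mid M_{S-\{z\}}, \rho_z) > \sum_{b=0}^{n-1} \Pr(\overline{M_S} \mid M_{S-\{z\}}, \rho'_z, |B| = b) \cdot \Pr(|B| = b \mid M_{S-\{z\}}, \rho'_z).$$

In order that $G_S$ has a matching there must be at least one node in the
neighborhood $N_z$ of $z$ that is not an element of $B$. Therefore we have to show

$$\sum_{b=0}^{n-1} \left[\sum_{d=1}^{m} \rho_z(d) \cdot \left(\frac{b}{m}\right)^d\right] \cdot \Pr(|B| = b \mid M_{S-\{z\}}, \rho_z) > \sum_{b=0}^{n-1} \left[\sum_{d=1}^{m} \rho'_z(d) \cdot \left(\frac{b}{m}\right)^d\right] \cdot \Pr(|B| = b \mid M_{S-\{z\}}, \rho_z).$$

Note that $B$ is independent of $z$ and $\rho_z$, respectively, and if $b = 0$ the modification
from $\rho_z$ to $\rho'_z$ does not affect the failure probability. Hence we consider only the
cases where $b > 0$ and it remains to show

$$\sum_{d=1}^{m} \rho_z(d) \cdot \left(\frac{b}{m}\right)^d > \sum_{d=1}^{m} \rho'_z(d) \cdot \left(\frac{b}{m}\right)^d \iff \left(\frac{b}{m}\right)^l + \left(\frac{b}{m}\right)^k > \left(\frac{b}{m}\right)^{l+1} + \left(\frac{b}{m}\right)^{k-1} \iff \left(\frac{b}{m}\right)^l > \left(\frac{b}{m}\right)^{k-1},$$

which is true since $0 < b/m < 1$ and $k - l \ge 2$. Since the event $\{b > 0\}$ has positive
probability, inequality (13) holds. This finishes the proof of Lemma 1. ■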
B Asymptotic Behavior and Thresholds

In this section we give the proof of Proposition 1. Let $n = c \cdot m$ for $c > 0$ and let $(p_x)_{x \in S}$ be a near optimal sequence of degree distributions with $\Delta = \alpha \cdot \lfloor \Delta \rfloor + (1 - \alpha) \cdot \lceil \Delta \rceil \ge 2$ for $\alpha \in (0, 1]$. Consider the random graph $G(\Delta, (p_x)_{x \in S})$ where each left node has $D_x \in \{\lfloor \Delta \rfloor, \lceil \Delta \rceil\}$ random neighbors (not necessarily distinct) and $D_x$ is distributed according to $p_x$, where $\alpha = 1/n \cdot \sum_{x \in S} p_x(\lfloor \Delta \rfloor)$.

Let $l = \lfloor \Delta \rfloor$. We consider a new random bipartite graph $\tilde{G}(l, \alpha)$ with $n$ left nodes and $m$ right nodes where a constant fraction of $\alpha$ left nodes has degree $l$, a fraction of $1 - \alpha$ left nodes has degree $l + 1$, and the neighbors of each left node are chosen uniformly at random without replacement. In summary, for $G$ the degrees of the left nodes are randomly chosen and duplicate neighbors are allowed; for $\tilde{G}$ the degrees of the left nodes are fixed and the neighbors are pairwise distinct.

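Now, for each $x$ from $S$ and $j \in \{l, l+1\}$ let $Y_x^{j}$ be a binary random variable with $Y_x^{j} = 1$ if the neighborhood set $N_x$ of $x$ has size $j$, and $Y_x^{j} = 0$ otherwise. Furthermore let $Y^{j} = \sum_{x \in S} Y_x^{j}$. Then

$$E(Y_x^{l}) = p_x(l) \cdot \frac{\binom{m}{l} \cdot l!}{m^{l}} + (1 - p_x(l)) \cdot \frac{\binom{l+1}{2} \cdot \binom{m}{l} \cdot l!}{m^{l+1}} \quad \text{and} \tag{14}$$

$$E(Y_x^{l+1}) = (1 - p_x(l)) \cdot \frac{\binom{m}{l+1} \cdot (l+1)!}{m^{l+1}}.$$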

Consider the events 

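1. $A = \{n \cdot \alpha - n^{\delta} < Y^{l} < n \cdot \alpha + n^{\delta}\}$ and

2. $B = \{n \cdot (1 - \alpha) - n^{\delta} < Y^{l+1} < n \cdot (1 - \alpha) + n^{\delta}\}$,

stating that the numbers of left nodes with neighborhood size $l$ and $l + 1$ are near $n \cdot \alpha$ and $n \cdot (1 - \alpha)$, respectively.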

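We want to bound the probability $\Pr(A \cap B)$ that both events hold from below, using the complementary event $\overline{A} \cup \overline{B}$, via the union bound $\Pr(\overline{A} \cup \overline{B}) \le \Pr(\overline{A}) + \Pr(\overline{B})$.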

First consider the event $A$. Let $Y_x = Y_x^{l}$ and let $Y = Y^{l}$ as well as let $p_x = \Pr(Y_x = 1)$. According to (14) it holds that

$$p_x = E(Y_x) = p_x(l) \cdot (1 - O(1/m)) + O(1/m),$$

since $1 - l^2/m \le \binom{m}{l} \cdot l!/m^{l} \le 1 - 1/m$, where the lower bound follows from Bernoulli's inequality.
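
For small parameters, the two probabilities that enter (14) and the sandwich bound on $\binom{m}{l} \cdot l!/m^{l}$ can be verified by brute force. The sketch below is an illustration under stated assumptions: the function names `prob_distinct`, `all_distinct` and `one_collision` are hypothetical, and the closed form in `one_collision` is the standard count of sequences of $l + 1$ draws that hit exactly $l$ distinct nodes.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial


def prob_distinct(m, draws, size):
    """Exact probability that `draws` uniform choices with replacement from m
    right nodes hit exactly `size` distinct nodes (brute-force enumeration)."""
    hits = sum(1 for seq in product(range(m), repeat=draws) if len(set(seq)) == size)
    return Fraction(hits, m ** draws)


def all_distinct(m, l):
    """binom(m, l) * l! / m^l: probability that l draws are pairwise distinct."""
    return Fraction(comb(m, l) * factorial(l), m ** l)


def one_collision(m, l):
    """Probability that l + 1 draws produce exactly l distinct nodes."""
    return Fraction(comb(l + 1, 2) * comb(m, l) * factorial(l), m ** (l + 1))


if __name__ == "__main__":
    m, l = 6, 3
    print(prob_distinct(m, l, l), all_distinct(m, l))
    print(prob_distinct(m, l + 1, l), one_collision(m, l))
    print(1 - Fraction(l * l, m), all_distinct(m, l), 1 - Fraction(1, m))
```

Exact fractions are used instead of floats so that the boundary case $l = 2$, where the upper bound holds with equality, is compared without rounding error.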

For each $x \in S$ let $Z_x = Y_x - p_x$. Now fix an arbitrary order of the left nodes, i.e., let $S = \{x_1, x_2, \ldots, x_n\}$. It holds that $X_0, X_1, \ldots, X_n$ with $X_0 = 0$ and $X_i = X_{i-1} + Z_{x_i}$ is a martingale with bounded differences, since

$$E(X_{i+1} \mid X_0, \ldots, X_i) = E(X_i + Z_{x_{i+1}} \mid X_0, \ldots, X_i) = X_i$$

and $|X_{i+1} - X_i| \le 1$. Applying a standard Azuma-Hoeffding inequality [3, Theorem 5.1] we get

$$\Pr(|X_n - X_0| > n^{\gamma}) = \Pr(|Y - E(Y)| > n^{\gamma}) \le 2 \cdot e^{-n^{2\gamma}/(2n)}.$$

That is, for $\gamma > 1/2$ the probability that the number of left nodes that have a neighborhood set of size $l$ differs by more than $n^{\gamma}$ from its expected value is exponentially small in $n$. By linearity of expectation, it holds

$$E(Y) = \sum_{x \in S} p_x = (1 - O(1/m)) \cdot \sum_{x \in S} p_x(l) + O(1),$$

and since $\alpha = 1/n \cdot \sum_{x \in S} p_x(l)$ it follows that $E(Y) = n \cdot \alpha \pm O(1)$. Thus, one can conclude that if $1 > \delta > \gamma > 1/2$ then the probability of the event $\overline{A}$ is exponentially small in $n$. Essentially the same proof shows an exponentially small bound for $\overline{B}$. Hence the probability of the event $M[G(\Delta, (p_x)_{x \in S})]$ that $G$ has a left-perfect matching can be bounded via

$$\Pr(M[G(\Delta, (p_x)_{x \in S})]) = \Pr(M[G(\Delta, (p_x)_{x \in S})] \mid A \cap B) \cdot (1 - O(e^{-n^{2\gamma-1}/2})) + O(e^{-n^{2\gamma-1}/2}).$$
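
The concentration of $Y^{l}$ around $n \cdot \alpha$ can be illustrated empirically. The following is a minimal Python sketch, assuming for simplicity that every left node uses the same probability $p_x(l) = \alpha$; the function name `count_size_l` and the parameter values are hypothetical and chosen only for illustration.

```python
import random


def count_size_l(n, m, l, alpha, rng):
    """Sample Y^l: the number of left nodes whose neighborhood set has exactly
    l elements, when each node has degree l with probability alpha and degree
    l + 1 otherwise, and neighbors are drawn with replacement from m nodes."""
    count = 0
    for _ in range(n):
        degree = l if rng.random() < alpha else l + 1
        neighborhood = {rng.randrange(m) for _ in range(degree)}
        if len(neighborhood) == l:
            count += 1
    return count


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed, assumption for reproducibility
    n, m, l, alpha, delta = 2000, 4000, 3, 0.5, 0.75
    for _ in range(5):
        y = count_size_l(n, m, l, alpha, rng)
        print(y, "deviation:", abs(y - n * alpha), "window:", n ** delta)
```

In such runs the deviation is typically on the order of $\sqrt{n}$, well inside the window $n^{\delta}$ for $\delta > 1/2$.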



Now consider the graph $\tilde{G}(l, \alpha)$. From [2, Theorem 3] it follows with similar concentration bounds as above that there is a constant $c^*(l, \alpha)$ such that for $n \to \infty$ we have: if $c = n/m < c^*(l, \alpha)$ then the probability that $\tilde{G}(l, \alpha)$ has a matching goes to 1, and if $c > c^*(l, \alpha)$ then the probability that $\tilde{G}(l, \alpha)$ has a matching goes to 0. The point of transition from success to failure is exactly the point where the 2-core of the corresponding hypergraph, which is the largest induced sub-hypergraph that has minimum degree at least 2, gets edge density larger than 1; see e.g. [5,4] for the case of hyperedges of only one size and [1,2] for the generalization to hyperedges of different sizes. If the 2-core is not empty then its number of edges is linear in $n$ and its number of nodes is linear in $m$.
Now assume that both events $A$ and $B$ take place. Let $G'$ be the induced subgraph of $G$ that covers each left node and its neighborhood if the left node has either $l$ or $l + 1$ pairwise distinct neighbors. The 2-core of the hypergraph regarding $G'$ has asymptotically the same density as the 2-core of the hypergraph regarding $\tilde{G}$. But since the 2-core has linear size or is empty it follows that the 2-core of $G$ has asymptotically the same density too. Hence the proposition follows. ■
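
The threshold behavior can be explored experimentally on small instances. The sketch below is not the method used in the proof: it samples $G$ with degrees from $\{l, l+1\}$ (assuming the same probability $\alpha$ for every left node) and tests for a left-perfect matching with a plain augmenting-path search. The function names `sample_graph`, `has_left_perfect_matching` and `success_rate` and all parameter values are illustrative assumptions.

```python
import random


def sample_graph(n, m, low, alpha, rng):
    """Each of the n left nodes gets degree `low` with probability alpha and
    low + 1 otherwise; neighbors are drawn with replacement from m right nodes."""
    graph = []
    for _ in range(n):
        degree = low if rng.random() < alpha else low + 1
        graph.append([rng.randrange(m) for _ in range(degree)])
    return graph


def has_left_perfect_matching(graph, m):
    """Augmenting-path bipartite matching; True iff every left node is matched."""
    match_right = [-1] * m  # match_right[y] = left node matched to y, or -1

    def try_augment(x, seen):
        for y in graph[x]:
            if y in seen:
                continue
            seen.add(y)
            if match_right[y] == -1 or try_augment(match_right[y], seen):
                match_right[y] = x
                return True
        return False

    # Stops at the first left node that cannot be matched.
    return all(try_augment(x, set()) for x in range(len(graph)))


def success_rate(n, c, low, alpha, trials, rng):
    """Fraction of sampled graphs with n = c * m that have a left-perfect matching."""
    m = max(1, round(n / c))
    wins = sum(has_left_perfect_matching(sample_graph(n, m, low, alpha, rng), m)
               for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    rng = random.Random(7)  # fixed seed, assumption for reproducibility
    for c in (0.80, 0.90, 0.95, 1.00):
        print(c, success_rate(n=200, c=c, low=3, alpha=0.5, trials=20, rng=rng))
```

For such small $n$ the transition is smeared out; the sharp threshold of Proposition 1 is an asymptotic statement for $n \to \infty$.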